Add pool balancing strategy #233

Merged

gabriel-samfira
Member

This change adds the ability to specify the pool balancing strategy to use when processing queued jobs. Before this change, GARM would round-robin through all pools that matched the set of tags requested by queued jobs.
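
As a rough illustration of what "matched the set of tags" could mean, the sketch here assumes a pool is a candidate when its tags contain every label a job requests. That subset rule, the case-insensitive comparison, and the names `Pool` and `matching_pools` are assumptions made for illustration; none of them is taken from the GARM code base. The tag sets mirror the two pools listed later in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    """Hypothetical stand-in for a GARM pool; only the fields needed here."""
    name: str
    tags: frozenset


def matching_pools(pools, job_labels):
    """Return pools whose tags include every label the job asks for.

    Assumption: matching is a case-insensitive subset check.
    """
    wanted = {label.lower() for label in job_labels}
    return [p for p in pools if wanted <= {t.lower() for t in p.tags}]


pools = [
    Pool("ubuntu", frozenset({"self-hosted", "x64", "Linux", "ubuntu", "repo"})),
    Pool("k8s", frozenset({"self-hosted", "x64", "Linux", "ubuntu", "k8s", "repo"})),
]

# Both pools carry these labels, so both are candidates for balancing.
both = matching_pools(pools, ["self-hosted", "linux", "ubuntu"])
# Only the second pool carries the "k8s" tag.
only_k8s = matching_pools(pools, ["self-hosted", "k8s"])
```

When more than one pool comes back from a check like this, the balancing strategy decides which of them gets the next runner.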

When round-robin (the default) is used for an entity (repo, org or enterprise) that has, for example, 2 pools with a common set of tags matching 10 queued jobs, those jobs trigger the creation of a new runner in each of the two pools in turn: job 1 goes to pool 1, job 2 to pool 2, job 3 to pool 1, job 4 to pool 2, and so on.
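
The alternation described here can be sketched in a few lines. This is a minimal illustration and not GARM's implementation: the function name `round_robin` and the pool labels are hypothetical, and it ignores pool capacity entirely.

```python
from itertools import cycle


def round_robin(pool_names, job_count):
    """Assign each job to the next pool in turn (pool 1, pool 2, pool 1, ...)."""
    turn = cycle(pool_names)
    return [next(turn) for _ in range(job_count)]


# Two pools and ten jobs, as in the example in the text.
assignments = round_robin(["pool 1", "pool 2"], 10)
for job, pool in enumerate(assignments, start=1):
    print(f"job {job} -> {pool}")
```

With two pools and ten jobs, each pool ends up with five runners.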

When "stack" is used, those same 10 jobs would trigger the creation of a new runner in the pool with the highest priority, every time.

In both cases, if a pool is full, the next one would be tried automatically.

For the stack case, this means that if pool 2 has a priority of 10 and pool 1 has a priority of 5, pool 2 is saturated first, then pool 1.
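
A small simulation may make the stack behaviour easier to picture. It is a sketch under stated assumptions, not the code from this PR: the `Pool` fields, the helper names `pick_stack` and `dispatch`, the capacity of 6 runners per pool, and the idea that a job simply stays queued when every pool is full are all illustrative choices. Only the priorities (10 and 5) come from the example.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical pool record; field names are illustrative."""
    name: str
    priority: int
    max_runners: int
    runners: int = 0

    def is_full(self):
        return self.runners >= self.max_runners


def pick_stack(pools):
    """Return the highest-priority pool that still has room, or None."""
    for pool in sorted(pools, key=lambda p: p.priority, reverse=True):
        if not pool.is_full():
            return pool
    return None


def dispatch(pools, job_count):
    """Create one runner per job; return the pool name chosen for each job."""
    chosen = []
    for _ in range(job_count):
        pool = pick_stack(pools)
        if pool is None:
            # Assumed behaviour: with every pool full, the job stays queued.
            chosen.append(None)
            continue
        pool.runners += 1
        chosen.append(pool.name)
    return chosen


# Priorities come from the example; max_runners=6 is an assumed capacity.
pools = [
    Pool("pool 1", priority=5, max_runners=6),
    Pool("pool 2", priority=10, max_runners=6),
]
result = dispatch(pools, 10)
```

Under these assumptions the first six jobs land on pool 2 and the remaining four fall through to pool 1 once pool 2 is full.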

To use this, we first need to set a priority on the pools:

ubuntu@garm:~/garm$ garm-cli pool ls -r 70227434-e7c0-4db1-8c17-e9ae3683f61e
+--------------------------------------+---------------------------+--------------+-----------------------------------------+------------------+-------+---------+---------------+----------+
| ID                                   | IMAGE                     | FLAVOR       | TAGS                                    | BELONGS TO       | LEVEL | ENABLED | RUNNER PREFIX | PRIORITY |
+--------------------------------------+---------------------------+--------------+-----------------------------------------+------------------+-------+---------+---------------+----------+
| 8ec34c1f-b053-4a5d-80d6-40afdfb389f9 | ubuntu:22.04              | default      | self-hosted x64 Linux ubuntu repo       | gsamfira/scripts | repo  | true    | garm          |        0 |
+--------------------------------------+---------------------------+--------------+-----------------------------------------+------------------+-------+---------+---------------+----------+
| 577627f4-1add-4a45-9c62-3a7cbdec8403 | runner-upstream:latest    | small        | self-hosted x64 Linux ubuntu k8s repo   | gsamfira/scripts | repo  | true    | garm          |        0 |
+--------------------------------------+---------------------------+--------------+-----------------------------------------+------------------+-------+---------+---------------+----------+

# Update priority on one pool

ubuntu@garm:~/garm$ garm-cli pool update --priority 100 577627f4-1add-4a45-9c62-3a7cbdec8403
+--------------------------+----------------------------------------------------------+
| FIELD                    | VALUE                                                    |
+--------------------------+----------------------------------------------------------+
| ID                       | 577627f4-1add-4a45-9c62-3a7cbdec8403                     |
| Provider Name            | k8s_external                                             |
| Priority                 | 100                                                      |
| Image                    | runner-upstream:latest                                   |
| Flavor                   | small                                                    |
| OS Type                  | linux                                                    |
| OS Architecture          | amd64                                                    |
| Max Runners              | 20                                                       |
| Min Idle Runners         | 1                                                        |
| Runner Bootstrap Timeout | 20                                                       |
| Tags                     | self-hosted, x64, Linux, ubuntu, k8s, repo               |
| Belongs to               | gsamfira/scripts                                         |
| Level                    | repo                                                     |
| Enabled                  | true                                                     |
| Runner Prefix            | garm                                                     |
| Extra specs              |                                                          |
| GitHub Runner Group      |                                                          |
| Instances                | garm-DNj8H6ntBHAC (13ca518d-b6e1-40ea-a949-6e488503c6ab) |
+--------------------------+----------------------------------------------------------+

Now we need to switch the 70227434-e7c0-4db1-8c17-e9ae3683f61e repository to stack:

ubuntu@garm:~/garm$ garm-cli repo update --pool-balancer-type=stack 70227434-e7c0-4db1-8c17-e9ae3683f61e
+----------------------+--------------------------------------+
| FIELD                | VALUE                                |
+----------------------+--------------------------------------+
| ID                   | 70227434-e7c0-4db1-8c17-e9ae3683f61e |
| Owner                | gsamfira                             |
| Name                 | scripts                              |
| Pool balancer type   | stack                                |
| Credentials          | gabriel_org                          |
| Pool manager running | true                                 |
+----------------------+--------------------------------------+

And now, when new jobs come in, the 577627f4-1add-4a45-9c62-3a7cbdec8403 pool should always be preferred until it is full.

GitHub currently doesn't allow us to prioritize which runners pick up jobs first, but we can at least decide which pools spin up runners first. This should offer some relief for issues like the one detailed here:

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Without preloading the entity we're listing pools for, we don't get that
entity's info when listing pools for a repo/org/enterprise.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
@gabriel-samfira changed the title from "Add pool balancing strategy" to "[WiP] Add pool balancing strategy" on Mar 14, 2024
@gabriel-samfira changed the title from "[WiP] Add pool balancing strategy" to "Add pool balancing strategy" on Mar 15, 2024
@gabriel-samfira force-pushed the add-balancing-strategy branch from 5db1a7e to ac29af6 on March 15, 2024 14:35
@gabriel-samfira merged commit 6bfcddc into cloudbase:main on Mar 15, 2024
4 checks passed
@gabriel-samfira deleted the add-balancing-strategy branch on March 15, 2024 21:40